Cooperative Co-evolutionary Algorithms (CCEAs) effectively train policies in multiagent systems with a single, statically defined team. However, many real-world problems, such as search and rescue, require agents to operate in multiple teams. When the structure of the team changes, these policies show reduced performance because they were trained to cooperate with only one team. In this work, we solve the cooperation problem by training agents to fill the needs of an arbitrary team, thereby gaining the ability to support a large variety of teams. We introduce Ad hoc Teaming Through Evolution (ATTE), which evolves a limited number of policy types using fitness aggregation across multiple teams. ATTE leverages agent types to reduce the dimensionality of the interaction search space, while fitness aggregation across teams selects for more adaptive policies. In a simulated multi-robot exploration task, ATTE learns policies that are effective in a variety of teaming schemes, improving the performance of a CCEA by a factor of up to five.
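To make the fitness-aggregation idea concrete, here is a minimal sketch assuming a simple co-evolutionary loop; the population sizes, team-sampling scheme, and `evaluate_team` placeholder are illustrative assumptions, not the authors' implementation. Each candidate policy is scored by its average return over several sampled team compositions.

```python
import random
import statistics

# Illustrative constants, not values from the paper.
NUM_TYPES = 3          # number of evolved policy types
POP_PER_TYPE = 10      # candidate policies per type
TEAMS_PER_EVAL = 5     # team compositions sampled per evaluation

def evaluate_team(team):
    """Placeholder episodic team return; a real system would run the
    multi-robot exploration simulation here."""
    return random.random()

def aggregate_fitness(candidate, pop_by_type):
    """Score one candidate by its average return over several teams
    assembled from the co-evolving type populations."""
    scores = []
    for _ in range(TEAMS_PER_EVAL):
        teammates = [random.choice(pop_by_type[t]) for t in range(NUM_TYPES)]
        scores.append(evaluate_team(teammates + [candidate]))
    return statistics.mean(scores)  # aggregation favors policies that adapt to many teams

# One evaluation sweep over all candidates of all types.
pop_by_type = {t: [f"policy_{t}_{i}" for i in range(POP_PER_TYPE)]
               for t in range(NUM_TYPES)}
fitness = {c: aggregate_fitness(c, pop_by_type)
           for members in pop_by_type.values() for c in members}
```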
In many real-world multiagent systems, agents must learn diverse tasks and coordinate with other agents. This paper introduces a method that allows heterogeneous agents to specialize and learn only the complementary, divergent behaviors needed for coordination in a shared environment. We use a hierarchical decomposition of diversity search and fitness optimization to allow agents to speciate and learn diverse temporally extended actions. Within an agent population, diversity across niches is favored, while agents within a niche compete to optimize the higher-level coordination task. Experimental results in a multiagent rover exploration task demonstrate the diversity of acquired agent behavior that promotes coordination.
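A minimal sketch of the two-level selection described above, assuming a simple niching scheme: diversity is favored across behavioral niches, while candidates within a niche compete on coordination fitness. The behavior descriptor and fitness placeholder below are illustrative only.

```python
import random

def behavior_descriptor(policy):
    """Map a policy to a coarse behavioral niche (e.g., by the temporally
    extended action it favors); hashing is just a stand-in."""
    return hash(policy) % 4

def coordination_fitness(policy):
    """Placeholder return on the shared rover-exploration task."""
    return random.random()

population = [f"policy_{i}" for i in range(40)]

# Level 1: speciate by behavior so divergent, complementary niches are kept.
niches = {}
for p in population:
    niches.setdefault(behavior_descriptor(p), []).append(p)

# Level 2: within each niche, candidates compete on coordination fitness.
survivors = []
for members in niches.values():
    members.sort(key=coordination_fitness, reverse=True)
    survivors.extend(members[: max(1, len(members) // 2)])
```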
Long-term robotic deployments are well described by sparse fitness functions, which are hard to learn from and adapt to. This work introduces Adaptive Multi-Fitness Learning (A-MFL), which augments the structure of Multi-Fitness Learning (MFL) [9] by injecting new behaviors into the agents as the environment changes. A-MFL not only improves system performance in dynamic environments, but also avoids undesirable, unforeseen side-effects of new behaviors by localizing where the new behaviors are used. On a simulated multi-robot problem, A-MFL provides up to 90% improvement over MFL, and 100% over a one-step evolutionary approach.
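The sketch below illustrates the localization idea described in the abstract, under the assumption that behaviors are kept in per-region libraries: a newly injected behavior becomes available only in the region where the change was detected, so it cannot cause side-effects elsewhere. All region and behavior names are hypothetical.

```python
import random

regions = ["north", "south", "east", "west"]
behavior_library = {r: ["explore", "return_to_base"] for r in regions}

def environment_changed(region):
    """Placeholder change detector (e.g., a drop in local fitness)."""
    return random.random() < 0.1

def inject_behavior(region, new_behavior):
    """Add the new behavior only where it is needed, limiting side-effects."""
    behavior_library[region].append(new_behavior)

for r in regions:
    if environment_changed(r):
        inject_behavior(r, "avoid_debris")  # hypothetical new behavior

def select_behavior(region, fitness_estimates):
    """Agents in a region choose only among locally available behaviors."""
    options = behavior_library[region]
    return max(options, key=lambda b: fitness_estimates.get((region, b), 0.0))
```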
Multiagent teams have been shown to be effective in many domains that require coordination among team members. However, finding valuable joint-actions becomes increasingly difficult in tightly-coupled domains where each agent's performance depends on the actions of many other agents. Reward shaping partially addresses this challenge by deriving more "tuned" rewards to provide agents with additional feedback, but this approach still relies on agents randomly discovering suitable joint-actions. In this work, we introduce Counterfactual Agent Suggestions (CAS) as a method for injecting knowledge into an agent's learning process within the confines of existing reward structures. We show that CAS enables agent teams to converge towards desired behaviors more reliably. We also show that the improvement in team performance in the presence of suggestions extends to large teams and tightly-coupled domains.
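As a hedged guess at how a counterfactual suggestion could be injected without changing the existing reward structure, the toy example below inserts a suggested teammate action into the joint action before evaluating the unchanged team reward, giving the agent feedback on otherwise unrewarded steps. This illustrates the general idea only, not the paper's formulation.

```python
def team_reward(joint_action):
    """Stand-in for the existing, unmodified reward: a tightly coupled task
    that pays off only if at least two agents pick the same point of interest."""
    targets = list(joint_action.values())
    return 1.0 if any(targets.count(t) >= 2 for t in set(targets)) else 0.0

def suggested_reward(joint_action, suggestion):
    """Evaluate the same reward with a suggested counterfactual teammate
    action inserted, linking the agent's action to the desired joint action."""
    counterfactual = dict(joint_action)
    counterfactual["suggested_teammate"] = suggestion
    return team_reward(counterfactual)

joint = {"agent_0": "poi_A", "agent_1": "poi_B"}
print(team_reward(joint))                 # 0.0: coupling requirement unmet
print(suggested_reward(joint, "poi_A"))   # 1.0: suggestion links agent_0 to the goal
```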
In multiagent problems that require complex joint actions, reward shaping methods yield good behavior by incentivizing the agents' potentially valuable actions. However, reward shaping often requires access to the functional form of the reward function and the global state of the system. In this work, we introduce the Exploratory Gaussian Reward (EGR), a new reward model that creates optimistic stepping-stone rewards linking the agents' potentially good actions to the desired joint action. EGR models the system reward as a Gaussian Process to leverage the inherent uncertainty in reward estimates, pushing agents to explore unobserved state space. In the tightly coupled rover coordination problem, we show that EGR significantly outperforms a neural network approximation baseline and is comparable to a system with access to the functional form of the global reward. Finally, we demonstrate how EGR improves performance over other reward shaping methods by forcing agents to explore and escape local optima.
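A minimal sketch of the Gaussian Process reward model described above, using scikit-learn and an assumed UCB-style optimism bonus (mean plus scaled standard deviation); the kernel choice and bonus form are assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
observed_states = rng.uniform(0.0, 0.5, size=(20, 2))    # visited (x, y) positions
observed_rewards = np.sin(observed_states.sum(axis=1))   # placeholder system reward

# Fit a GP to the observed system rewards.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-3)
gp.fit(observed_states, observed_rewards)

def exploratory_reward(state, beta=1.0):
    """Optimistic stepping-stone reward: high where the GP is uncertain."""
    mean, std = gp.predict(np.atleast_2d(state), return_std=True)
    return float(mean[0] + beta * std[0])

print(exploratory_reward([0.9, 0.9]))  # larger bonus in the unexplored corner
```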
In many multiagent domains, and particularly in tightly coupled domains, teasing out an agent's contribution to the system performance from a single episodic return is difficult. This well-known difficulty hits state-to-action mapping approaches, such as neural networks trained by evolutionary algorithms, particularly hard. This paper introduces fitness critics, which leverage the expected fitness to evaluate an agent's performance. This approach turns a sparse performance metric (policy evaluation) into a dense performance metric (state-action evaluation) by relating the episodic feedback to the state-action pairs experienced during the execution of that policy. In the tightly-coupled multi-rover domain (where multiple rovers have to perform a particular task simultaneously), only teams using fitness critics were able to demonstrate effective learning on tasks with tight coupling, while other coevolved teams were unable to learn at all.
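The sketch below illustrates the core idea of a fitness critic as summarized above: every state-action pair in an episode is labeled with that episode's single return, and a regression model trained on these pairs then provides a dense, per-step evaluation. The linear least-squares critic is a stand-in for the paper's learned critic; the feature representation is an assumption for illustration.

```python
import numpy as np

def train_fitness_critic(trajectories, returns):
    """Label every state-action pair in an episode with that episode's single
    return, then fit a linear critic by least squares."""
    X = np.vstack([sa for traj in trajectories for sa in traj])
    y = np.concatenate([[G] * len(traj) for traj, G in zip(trajectories, returns)])
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def dense_fitness(trajectory, weights):
    """Dense evaluation of a policy: average critic estimate over the
    state-action pairs it visited, replacing the sparse episodic return."""
    return float(np.mean(np.vstack(trajectory) @ weights))

# Toy usage with random state-action features.
rng = np.random.default_rng(1)
trajectories = [list(rng.normal(size=(5, 4))) for _ in range(3)]
returns = [0.0, 0.5, 1.0]
weights = train_fitness_critic(trajectories, returns)
print(dense_fitness(trajectories[2], weights))
```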